This series of files compile all analyses done during Chapter 1 for the regional campaign of 2017:

All analyses have been done with PRIMER-e 6 and R 3.6.2.

Click on the table of contents in the left margin to assess a specific analysis.
Click on a figure to zoom it

To assess maps and figures, click here.
To go back to the summary page, click here.


We used data from subtidal ecosystems (see metadata files for more information). Only stations that have been sampled both for abiotic parameters and benthic species were included.

Selected variables for the analyses:

We only considered 500 µm communities for these analyses. Abundances of Bipalponephtys neotena (Bneo) and Nematoda (Nema) were also considered (see IndVal and SIMPER results).

Heavy metal concentrations for campaign 2017 have been kriged from the values collected at campaigns 2014 and 2016. As data is missing for metal concentrations outside BSI, two Designs have been used:


1. Data manipulation

For the following analyses, independant variables are habitat parameters and heavy metal concentrations, dependant variables are diversity indices.

1.1. Identification of outliers

To identify stations that are not consistent with the others, we used the multivariate Cook’s Distance (CD) on the uncorrelated variables. A significative threshold of 4 times the mean of CD has been established.

Design 1

Based on Cook’s Distance, we identified stations 188, 194 and 228 as general outliers. They have been deleted for the following analyses of Design 1.

Design 2

Based on Cook’s Distance, we identified stations 132 and 154 as general outliers. They have been deleted for the following analyses of Design 2.

1.2. Correlations between parameters

Correlations have been calculated with Spearman’s rank coefficient.

Design 1

Correlation coefficients between habitat parameters (Design 1)
  om gravel sand silt clay
om 1 -0.365 -0.791 0.877 0.119
gravel -0.365 1 0.015 -0.325 0.183
sand -0.791 0.015 1 -0.896 -0.47
silt 0.877 -0.325 -0.896 1 0.212
clay 0.119 0.183 -0.47 0.212 1

According to these results, the following variables are highly correlated (\(|\rho|\) > 0.80) so they have been considered together in the regressions of Design 1:

  • sand and silt (silt deleted)

Design 2

Correlation coefficients between heavy metals concentrations (Design 2)
  arsenic cadmium chromium copper iron manganese mercury lead zinc
arsenic 1 0.658 0.838 0.801 0.701 0.598 0.438 0.8 0.874
cadmium 0.658 1 0.809 0.467 0.719 0.794 0.288 0.806 0.777
chromium 0.838 0.809 1 0.811 0.871 0.84 0.406 0.889 0.945
copper 0.801 0.467 0.811 1 0.688 0.573 0.308 0.805 0.896
iron 0.701 0.719 0.871 0.688 1 0.834 0.182 0.71 0.848
manganese 0.598 0.794 0.84 0.573 0.834 1 0.208 0.679 0.77
mercury 0.438 0.288 0.406 0.308 0.182 0.208 1 0.488 0.37
lead 0.8 0.806 0.889 0.805 0.71 0.679 0.488 1 0.924
zinc 0.874 0.777 0.945 0.896 0.848 0.77 0.37 0.924 1

Many variables are highly correlated (\(|\rho|\) > 0.80), but we have considered the following together in the regressions of Design 2:

  • chromium, iron and manganese (iron and manganese deleted)
  • copper, lead and zinc (copper and zinc deleted)

We decided to keep arsenic, even though it is correlated with the copper/lead/zinc group, to stay consistant with the 2014 and 2016 campaigns.

2. Permutational Analyses of Covariance

Results of univariate PermANCOVAs on parameters and multivariate PermANCOVA on the whole benthic community with depth as covariate are presented in the table below. Variables were normalized and abundances were (log+1) transformed.

Variable Condition Depth
om S
gravel
sand
silt S
clay
S (500 µm) S
N (500 µm)
H (500 µm) S
J (500 µm) S
ALL SPECIES (500 µm) S S

3. Similarity and characteristic species

Let’s have a look at the \(\beta\) diversity within our conditions and sites.

Results of the PERMDISP routine are shown below (mean and SE of the deviation from centroid for each group, i.e. multivariate dispersion), along with the mean Bray-Curtis dissimilarity for each group. Abundances were (log+1) transformed and PRIMER was used to do the PERMDISP.

Mean within-group Bray-Curtis dissimilarity for each condition or site
  Mean deviation SE of deviation Mean BC dissimilarity
HI 57.5 1.64 0.829
R 48.8 2.43 0.71

Significative differences in dispersion have been detected between HI and R (p = 0.0053).

The following analyses allowed to detect species as characteristic of each condition. We used results from PRIMER to justify further their choice.

##                        cluster indicator_value probability
## bipalponephtys_neotena       1          0.6390       0.001
## macoma_calcarea              1          0.5583       0.004
## goniada_maculata             1          0.2917       0.028
## nematoda                     2          0.6565       0.005
## echinarachnius_parma         2          0.6524       0.001
## crenella_decussata           2          0.5836       0.001
## ecrobia_truncata             2          0.3333       0.005
## mesodesma_arctatum           2          0.2667       0.016
## solariella_sp                2          0.2000       0.039
## 
## Sum of probabilities                 =  55.401 
## 
## Sum of Indicator Values              =  13.26 
## 
## Sum of Significant Indicator Values  =  4.18 
## 
## Number of Significant Indicators     =  9 
## 
## Significant Indicator Distribution
## 
## 1 2 
## 3 6
SIMPER results (mean Bray-Curtis between-group dissimilarity: 0.872)
  average sd ratio ava avb cumsum
nematoda 0.0743 0.0621 1.2 0.879 2.08 0.0852
echinarachnius_parma 0.0618 0.0632 0.977 0.361 1.59 0.156
bipalponephtys_neotena 0.0556 0.0485 1.15 1.78 0.193 0.22
eudorellopsis_integra 0.0358 0.0494 0.724 1.08 0.119 0.261
crenella_decussata 0.0346 0.0352 0.983 0.0289 1.03 0.3
mesodesma_arctatum 0.0338 0.064 0.528 0 0.751 0.339
macoma_calcarea 0.0316 0.0323 0.977 1.03 0.0462 0.375
harpacticoida 0.0283 0.0325 0.872 0.801 0.258 0.408
phoxocephalus_holbolli 0.0282 0.0329 0.857 0.411 0.658 0.44
amphipoda 0.0234 0.0274 0.852 0.483 0.346 0.467
pholoe_sp 0.0215 0.0249 0.866 0.413 0.434 0.492
ameritella_agilis 0.018 0.0252 0.715 0.132 0.451 0.512
ecrobia_truncata 0.0168 0.0321 0.525 0 0.575 0.532
ennucula_tenuis 0.0164 0.022 0.742 0.401 0.212 0.55
ischyrocerus_anguipes 0.0155 0.0327 0.475 0.406 0.13 0.568
akanthophoreus_gracilis 0.0154 0.0263 0.584 0.389 0.193 0.586
hiatella_arctica 0.015 0.0309 0.483 0.0747 0.368 0.603
ostracoda 0.013 0.0195 0.664 0.207 0.258 0.618
cistenides_granulata 0.0129 0.0202 0.641 0.233 0.212 0.633
axinopsida_orbiculata 0.0117 0.025 0.47 0.361 0 0.646
thracia_septentrionalis 0.0117 0.0185 0.633 0.183 0.266 0.66
mysella_planulata 0.0117 0.0263 0.445 0 0.333 0.673
sabellidae_spp 0.0112 0.0323 0.348 0.361 0.0462 0.686
leucon_leucon_nasicoides 0.0107 0.0237 0.454 0.294 0 0.698
nephtyidae_spp 0.0107 0.0176 0.608 0.274 0.0462 0.71
orchomenella_minuta 0.0102 0.019 0.539 0.0578 0.212 0.722
nephtys_caeca 0.00966 0.0194 0.497 0.116 0.119 0.733
goniada_maculata 0.00965 0.0165 0.585 0.286 0 0.744
mytilus_sp 0.0092 0.0268 0.342 0.287 0 0.755
solariella_sp 0.0089 0.0205 0.435 0 0.29 0.765
pontoporeia_femorata 0.00864 0.0245 0.352 0.225 0 0.775
parvicardium_pinnulatum 0.00863 0.0164 0.527 0.0578 0.227 0.785
caprella_septentrionalis 0.00844 0.029 0.291 0.34 0 0.795
protomedeia_fasciata 0.00786 0.0165 0.478 0.241 0 0.804
lamprops_fuscatus 0.00694 0.0128 0.543 0.207 0.0462 0.811
pholoe_longa 0.00605 0.0189 0.319 0.11 0.0732 0.818
protomedeia_grandimana 0.00592 0.0173 0.342 0.213 0 0.825
astarte_sp 0.00588 0.013 0.454 0.0747 0.119 0.832
chone_sp 0.00567 0.0282 0.201 0.107 0 0.838
monoculopsis_longicornis 0.00559 0.0144 0.388 0.125 0.0924 0.845
ophelia_limacina 0.00551 0.0112 0.494 0.0289 0.166 0.851
glycera_sp 0.00542 0.0171 0.317 0.116 0 0.857
chaetodermatida 0.00541 0.0117 0.462 0.144 0.0462 0.864
nephtys_incisa 0.00535 0.0126 0.423 0.104 0.0462 0.87
maldanidae_spp 0.00472 0.0156 0.303 0.113 0.0462 0.875
aceroides_aceroides_latipes 0.00452 0.0111 0.406 0.144 0 0.88
quasimelita_formosa 0.00447 0.0122 0.366 0.125 0 0.885
sipuncula 0.00428 0.0122 0.35 0 0.0924 0.89
euchone_sp 0.00426 0.0207 0.206 0.146 0 0.895
anthozoa 0.00409 0.0111 0.369 0 0.119 0.9

4. Univariate regressions

We used linear models for the all regressions on diversity indices. Outliers and correlated variables were removed from these analyses.

4.1. Simple regressions

These analyses have been do to explore the relationships between variables. As it is a huge number of results to interpret, only multiple regressions will be included in the article (see below).

Design 1

Adjusted R-squared of simple regressions for Design 1
  om gravel sand clay
S_500 -0.02767 0.01197 -0.001571 -0.02831
N_500 -0.0185 -0.02098 -0.02294 0.002618
H_500 -0.009374 -0.0139 0.02245 0.02041
J_500 0.02241 -0.02941 0.005738 0.02843
p-values of simple regressions for Design 1
  om gravel sand clay
S_500 0.8117 0.241 0.3378 0.8497
N_500 0.5501 0.5997 0.6458 0.3034
H_500 0.417 0.4757 0.1882 0.1973
J_500 0.1883 0.9884 0.2806 0.1639

Design 2

Quitting from lines 258-277 (C1_analyses_17B.Rmd) Error in pandoc.table.return(…) : Wrong number of parameters (7 instead of 6) passed: justify De plus : Warning messages: 1: attribute variables are assumed to be spatially constant throughout all geometries 2: attribute variables are assumed to be spatially constant throughout all geometries 3: attribute variables are assumed to be spatially constant throughout all geometries 4: attribute variables are assumed to be spatially constant throughout all geometries 5: attribute variables are assumed to be spatially constant throughout all geometries 6: attribute variables are assumed to be spatially constant throughout all geometries 7: attribute variables are assumed to be spatially constant throughout all geometries 8: attribute variables are assumed to be spatially constant throughout all geometries Quitting from lines 258-277 (C1_analyses_17B.Rmd) Error in pandoc.table.return(…) : Wrong number of parameters (7 instead of 6) passed: justify

Furthermore, depth has been shown important for several parameters in the ANCOVAs, so here are the corresponding scatterplots.

4.2. Multiple regressions

This section presents analyses done (i) to determine which model (Design 1, Design 2) decribes the best the parameters and (ii) which variables are the most important to explain the parameters.

4.2.1. Best model selection

This step was not used here as both models were needed.

4.2.2. Significative variables selection

We identified which variables were selected after an AIC procedure to predict the best the parameters. Results of the variable selection, according to AIC, are shown on the tables below:

  • for the model of Design 1
Variable (or combination) S_500 N_500 H_500 J_500
om +
gravel +
sand/silt +
clay +
Adjusted \(R^{2}\) 0 0 0 0.06
  • for the model of Design 2
Variable (or combination) S_500 N_500 H_500 J_500
arsenic - - -
cadmium +
chromium/iron/manganese - - -
mercury - -
lead/copper/zinc + + +
Adjusted \(R^{2}\) 0.24 0.26 0.24 0.09

Details of the regressions, with diagnostics and cross-validation, are summarized below.

Design 1

Species richness
## FULL MODEL
## Adjusted R2 is: -0.06
Fitting linear model: S_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.46 10.1 1.63 0.1132
om -1.441 3.678 -0.3918 0.6979
gravel 3.448 11.32 0.3046 0.7627
sand -6.499 9.858 -0.6593 0.5146
clay -7.884 17.56 -0.4489 0.6566
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 54.74662
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: S_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.81 0.8365 12.92 7.05e-15 * * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 4.980853

Quitting from lines 335-336 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = full_model\)model, form.lm = full_model, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = reduced_model$model, form.lm = reduced_model, m = 5, :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Total abundance
## FULL MODEL
## Adjusted R2 is: -0.06
Fitting linear model: N_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 210.2 164.1 1.281 0.2098
om -48.57 59.75 -0.8128 0.4225
gravel -52.99 183.9 -0.2882 0.7751
sand -117.8 160.2 -0.7354 0.4676
clay -323.7 285.3 -1.135 0.2652
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 1431.399
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: N_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 81.67 13.57 6.018 7.313e-07 * * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 86.93245

Quitting from lines 346-347 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = full_model\)model, form.lm = full_model, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = reduced_model$model, form.lm = reduced_model, m = 5, :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Shannon index
## FULL MODEL
## Adjusted R2 is: -0.03
Fitting linear model: H_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.8243 1.189 0.6932 0.4934
om 0.357 0.433 0.8246 0.4159
gravel 1.322 1.333 0.9921 0.3288
sand 0.6177 1.161 0.5323 0.5983
clay 2.295 2.067 1.11 0.2756
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 17.33173
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: H_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.627 0.1001 16.25 6.987e-18 * * *
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.6033001

Quitting from lines 357-358 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = full_model\)model, form.lm = full_model, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = reduced_model$model, form.lm = reduced_model, m = 5, :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Piélou’s evenness
## FULL MODEL
## Adjusted R2 is: 0.06
Fitting linear model: J_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04865 0.3676 0.1323 0.8956
om 0.2648 0.1339 1.978 0.05686
gravel 0.583 0.4119 1.415 0.1669
sand 0.5808 0.3588 1.619 0.1156
clay 1.303 0.6391 2.038 0.05012
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 5.628968
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0.06
Fitting linear model: J_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04865 0.3676 0.1323 0.8956
om 0.2648 0.1339 1.978 0.05686
gravel 0.583 0.4119 1.415 0.1669
sand 0.5808 0.3588 1.619 0.1156
clay 1.303 0.6391 2.038 0.05012
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 5.628968
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

Design 2

Species richness
## FULL MODEL
## Adjusted R2 is: 0.21
Fitting linear model: S_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 23.8 5.102 4.666 0.0002584 * * *
arsenic -1.844 1.359 -1.356 0.1938
cadmium -24.45 42.5 -0.5752 0.5731
chromium -0.1864 0.171 -1.09 0.2917
mercury -249 178.1 -1.398 0.1812
lead 2.31 1.714 1.347 0.1966
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 6.476836
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.24
Fitting linear model: S_500 ~ arsenic + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.94 3.86 5.683 2.694e-05 * * *
arsenic -1.859 1.332 -1.396 0.1807
chromium -0.2166 0.1595 -1.358 0.1923
mercury -228.2 170.9 -1.335 0.1994
lead 2.211 1.672 1.323 0.2035
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 5.727678
Variance Inflation Factors
  arsenic chromium mercury lead
VIF 2.11 2.99 1.2 3.67

Total abundance
## FULL MODEL
## Adjusted R2 is: 0.18
Fitting linear model: N_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 166.3 86.35 1.926 0.07209
arsenic -18.89 23 -0.8212 0.4236
cadmium 934.9 719.4 1.3 0.2121
chromium -0.3244 2.894 -0.1121 0.9121
mercury -4735 3014 -1.571 0.1358
lead -4.739 29.02 -0.1633 0.8723
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 100.7356
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.26
Fitting linear model: N_500 ~ arsenic + cadmium + mercury
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 173.6 76.52 2.268 0.03584 *
arsenic -24.78 14.68 -1.688 0.1086
cadmium 795.1 568.3 1.399 0.1788
mercury -5175 2583 -2.004 0.06036
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 90.95802
Variance Inflation Factors
  arsenic cadmium mercury
VIF 1.42 1.33 1.11

Shannon index
## FULL MODEL
## Adjusted R2 is: 0.24
Fitting linear model: H_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.153 0.598 5.273 7.572e-05 * * *
arsenic -0.2436 0.1593 -1.529 0.1458
cadmium -4.38 4.982 -0.8792 0.3923
chromium -0.03856 0.02004 -1.924 0.07233
mercury -26.63 20.87 -1.276 0.2203
lead 0.4741 0.2009 2.359 0.03135 *
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.8342848
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.24
Fitting linear model: H_500 ~ arsenic + chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.53 0.3833 6.602 3.363e-06 * * *
arsenic -0.2266 0.1584 -1.43 0.1698
chromium -0.04207 0.01902 -2.212 0.04012 *
lead 0.3896 0.191 2.04 0.05632
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.6644866
Variance Inflation Factors
  arsenic chromium lead
VIF 2.09 2.98 3.51

Piélou’s evenness
## FULL MODEL
## Adjusted R2 is: -0.07
Fitting linear model: J_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.8522 0.2505 3.403 0.003639 * *
arsenic -0.018 0.06672 -0.2697 0.7908
cadmium -0.2464 2.086 -0.1181 0.9075
chromium -0.01462 0.008394 -1.741 0.1008
mercury 0.6441 8.743 0.07367 0.9422
lead 0.1317 0.08416 1.565 0.1371
## FULL MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.3580101
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.09
Fitting linear model: J_500 ~ chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.8433 0.1476 5.712 1.663e-05 * * *
chromium -0.01498 0.007328 -2.045 0.055
lead 0.1212 0.06253 1.938 0.06757
## REDUCED MODEL
## Diagnostics: cf plots
## RMSE from cross-validation: 0.3273077
Variance Inflation Factors
  chromium lead
VIF 2.98 2.98

5. Multivariate regressions

Independant variables are habitat parameters or heavy metal concentrations, dependant variables are species abundances. Outliers and correlated variables have been excluded from the analysis.

This analysis has been done on PRIMER, with a DistLM to identify the variables that explain the most the community variability and with a dbRDA to plot the results.

Design 1

Variables selected by the DistLM procedure have a \(R^{2}\) of 0.23.

Design 2

Variables selected by the DistLM procedure have a \(R^{2}\) of 0.13.